Data Quality and Curation
Author
Abstract
Data quality is an issue that touches on every aspect of the research data landscape and is therefore appropriate to examine in the context of planning for future research data infrastructures. As producers, researchers want to believe that they produce high quality data; as consumers, they want to obtain data of the highest quality. Data centres typically have stringent controls to ensure that they only acquire and disseminate data of the highest quality. Data managers will usually say that they improve the quality of the data they are responsible for. Much of the infrastructure that will emit, transform, integrate, visualise, manage, analyse, and disseminate data during its life will have dependencies, explicit or implicit, on the quality of the data it is dealing with.
Similar resources
Study of the foundation, models and issues of research data curation and management in scientific and academic environments
Background and Aim: The purpose of this paper is to study, identify, and discuss the foundations and concepts, models and frameworks, and dimensions and challenges of research data curation and management in scientific and academic environments. Method: This is a review article, and a library-based method was used to collect scientific and research texts in this field. In this research, external an...
Genomics Data Curation Roles, Skills, and Perception of Data Quality
Compared to a decade ago, genomics scientists, driven by technical changes and availability of massive genomic data, are performing a wider plurality of curation roles including those of end-users, curators, or dual-role users. Scientists with different curation roles (including that of end user) may focus on different data quality aspects and skills requirements in a community curation environ...
Big Data to Knowledge: Harnessing Semiotic Relationships of Data Quality and Skills in Genome Curation Work
This article aims to understand the views of genomics scientists with regard to the data quality assurances associated with semiotics and Data-Information-Knowledge (DIK). The resulting communication of signs generated from genomic curation work was found within different semantic levels of DIK that correlate specific data quality dimensions with their respective skills. Syntactic DQ dimension...
Long-term Digital Metadata Curation
The rapid increase in data volume and data availability, along with the need for continual, quality-assured searching and indexing of such data, requires efficient and effective metadata management strategies. From this perspective, adequate, well-managed, and high-quality metadata is becoming increasingly essential for successful long-term high quality data preservati...
Lenses: An On-Demand Approach to ETL
Three mentalities have emerged in analytics. One view holds that reliable analytics is impossible without high-quality data, and relies on heavy-duty ETL processes and upfront data curation to provide it. The second view takes a more ad-hoc approach, collecting data into a data lake, and placing responsibility for data quality on the analyst querying it. A third, on-demand approach has emerged ...
Domain knowledge and data quality perceptions in genome curation work
Purpose: This article aims at understanding genomics scientists' perceptions in data quality assurances based on their domain knowledge. Design/methodology/approach: The study used a survey method to collect responses from 149 genomics scientists grouped by domain knowledge. They ranked the top-five quality criteria based on hypothetical curation scenarios. The results were compared using Chi-Squ...
Journal title:
- Data Science Journal
Volume 12, Issue
Pages -
Publication date 2013